contaminated sample
Partial identification of kernel based two sample tests with mismeasured data
Nonparametric two-sample tests such as the Maximum Mean Discrepancy (MMD) are often used to detect differences between two distributions in machine learning applications. However, the majority of existing literature assumes that error-free samples from the two distributions of interest are available.We relax this assumption and study the estimation of the MMD under $\epsilon$-contamination, where a possibly non-random $\epsilon$ proportion of one distribution is erroneously grouped with the other. We show that under $\epsilon$-contamination, the typical estimate of the MMD is unreliable. Instead, we study partial identification of the MMD, and characterize sharp upper and lower bounds that contain the true, unknown MMD. We propose a method to estimate these bounds, and show that it gives estimates that converge to the sharpest possible bounds on the MMD as sample size increases, with a convergence rate that is faster than alternative approaches. Using three datasets, we empirically validate that our approach is superior to the alternatives: it gives tight bounds with a low false coverage rate.
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)
- Health & Medicine > Therapeutic Area > Hematology (0.46)
- Health & Medicine > Diagnostic Medicine > Imaging (0.46)
Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination
Yu, Jongmin, Oh, Hyeontaek, Kim, Minkyung, Kim, Junsik
In this paper, we propose Normality-Calibrated Autoencoder (NCAE), which can boost anomaly detection performance on the contaminated datasets without any prior information or explicit abnormal samples in the training phase. The NCAE adversarially generates high confident normal samples from a latent space having low entropy and leverages them to predict abnormal samples in a training dataset. NCAE is trained to minimise reconstruction errors in uncontaminated samples and maximise reconstruction errors in contaminated samples. The experimental results demonstrate that our method outperforms shallow, hybrid, and deep methods for unsupervised anomaly detection and achieves comparable performance compared with semi-supervised methods using labelled anomaly samples in the training phase. The source code is publicly available on https://github.com/andreYoo/
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > South Korea > Daejeon > Daejeon (0.04)
Robust MIL-Based Feature Template Learning for Object Tracking
Lan, Xiangyuan (Hong Kong Baptist University) | Yuen, Pong C. (Hong Kong Baptist University) | Chellappa, Rama (University of Maryland, College Park)
Because of appearance variations, training samples of the tracked targets collected by the online tracker are required for updating the tracking model. However, this often leads to tracking drift problem because of potentially corrupted samples: 1) contaminated/outlier samples resulting from large variations (e.g. occlusion, illumination), and 2) misaligned samples caused by tracking inaccuracy. Therefore, in order to reduce the tracking drift while maintaining the adaptability of a visual tracker, how to alleviate these two issues via an effective model learning (updating) strategy is a key problem to be solved. To address these issues, this paper proposes a novel and optimal model learning (updating) scheme which aims to simultaneously eliminate the negative effects from these two issues mentioned above in a unified robust feature template learning framework. Particularly, the proposed feature template learning framework is capable of: 1) adaptively learning uncontaminated feature templates by separating out contaminated samples, and 2) resolving label ambiguities caused by misaligned samples via a probabilistic multiple instance learning (MIL) model. Experiments on challenging video sequences show that the proposed tracker performs favourably against several state-of-the-art trackers.
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- Asia > China > Hong Kong (0.04)